Llama 3 8B Instruct Gradient 1048k
A long-context extension of Llama 3 8B Instruct developed by Gradient. It supports context lengths of over 1 million tokens (1048k), achieved by adjusting the RoPE theta parameter so that positional encoding remains usable over very long inputs.
Large Language Model
Transformers English
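
As a rough illustration of why RoPE theta matters for context length, the sketch below computes the standard rotary inverse frequencies theta^(-2i/d) and shows that a larger theta lengthens the slowest rotation's wavelength. The head size of 128, the starting theta of 500,000, and the larger theta of 1e9 are assumptions chosen for illustration. They are not taken from this listing and are not the model's confirmed settings.

```python
import math


def rope_inv_freq(head_dim, theta):
    """Return rotary inverse frequencies theta^(-2i/head_dim), one per dimension pair."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim, theta):
    """Wavelength, in tokens, of the slowest-rotating dimension pair."""
    return 2 * math.pi / rope_inv_freq(head_dim, theta)[-1]


HEAD_DIM = 128          # assumed attention head size (illustrative)
BASE_THETA = 500_000.0  # assumed starting theta (illustrative)
LARGER_THETA = 1e9      # hypothetical larger theta, not the model's actual value

for theta in (BASE_THETA, LARGER_THETA):
    wavelength = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:.3g}: longest wavelength ~ {wavelength:,.0f} tokens")
```

A larger theta slows the rotation of every dimension pair, so positions far apart remain distinguishable. In practice, a change like this is typically paired with further training on long sequences.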